This analysis examines how LEGO set releases have evolved over time, covering set counts, theme popularity, color variety, and complexity. The data spans 1949 to 2024, and the analysis emphasizes trends and fluctuations in LEGO’s product strategy.
The LEGO Group has significantly increased the number of set releases per year since 2000, with notable market responsiveness observed during the COVID-19 pandemic, as seen in the peaks of 2020 and 2021.
The analysis identifies a surge in product diversity, highlighted by the growing number of unique colors per set, indicating a strategic pivot towards more visually engaging and complex offerings.
Thematic trends reveal that certain series, such as DC Super Heroes and various Minifigures series, dominate release counts, reflecting both sustained popularity and strategic brand partnerships.
An examination of the most popular colors shows a dominance of traditional colors, with black and white leading, suggesting these colors’ foundational roles in set design.
The correlation between set size and color variety is positive, indicating that larger sets tend to offer a greater spectrum of colors, adding to their complexity and appeal.
Dataset size:
## [1] 263 4
## id name rgb is_trans
## Min. : -1.0 Length:263 Length:263 Length:263
## 1st Qu.: 83.0 Class :character Class :character Class :character
## Median :1005.0 Mode :character Mode :character Mode :character
## Mean : 651.4
## 3rd Qu.:1070.5
## Max. :9999.0
Dataset size:
## [1] 60456 4
## element_id part_num color_id design_id
## Min. : 9327 Length:60456 Min. : -1.0 Min. : 1001
## 1st Qu.:4565425 Class :character 1st Qu.: 10.0 1st Qu.: 18454
## Median :6111350 Mode :character Median : 28.0 Median : 41748
## Mean :5517587 Mean : 120.4 Mean : 45570
## 3rd Qu.:6286413 3rd Qu.: 85.0 3rd Qu.: 75474
## Max. :6499141 Max. :9999.0 Max. :107520
Dataset size:
## [1] 37265 3
## id version set_num
## Min. : 1 Min. : 1.000 Length:37265
## 1st Qu.: 14424 1st Qu.: 1.000 Class :character
## Median : 54379 Median : 1.000 Mode :character
## Mean : 61104 Mean : 1.091
## 3rd Qu.: 88842 3rd Qu.: 1.000
## Max. :194312 Max. :16.000
Dataset size:
## [1] 20858 3
## inventory_id fig_num quantity
## Min. : 3 Length:20858 Min. : 1.000
## 1st Qu.: 7869 Class :character 1st Qu.: 1.000
## Median : 15681 Mode :character Median : 1.000
## Mean : 43010 Mean : 1.062
## 3rd Qu.: 66834 3rd Qu.: 1.000
## Max. :194312 Max. :100.000
Dataset size:
## [1] 1180987 6
## inventory_id part_num color_id quantity
## Min. : 1 Length:1180987 Min. : -1.0 Min. : 1.00
## 1st Qu.: 9404 Class :character 1st Qu.: 4.0 1st Qu.: 1.00
## Median : 22838 Mode :character Median : 15.0 Median : 2.00
## Mean : 50849 Mean : 131.8 Mean : 3.37
## 3rd Qu.: 87088 3rd Qu.: 71.0 3rd Qu.: 4.00
## Max. :194312 Max. :9999.0 Max. :3064.00
## is_spare img_url
## Length:1180987 Length:1180987
## Class :character Class :character
## Mode :character Mode :character
##
##
##
Dataset size:
## [1] 4358 3
## inventory_id set_num quantity
## Min. : 35 Length:4358 Min. : 1.000
## 1st Qu.: 8076 Class :character 1st Qu.: 1.000
## Median : 16423 Mode :character Median : 1.000
## Mean : 52519 Mean : 1.813
## 3rd Qu.: 98685 3rd Qu.: 1.000
## Max. :191576 Max. :60.000
Dataset size:
## [1] 13764 4
## fig_num name num_parts img_url
## Length:13764 Length:13764 Min. : 0.000 Length:13764
## Class :character Class :character 1st Qu.: 4.000 Class :character
## Mode :character Mode :character Median : 4.000 Mode :character
## Mean : 5.296
## 3rd Qu.: 5.000
## Max. :156.000
Dataset size:
## [1] 66 2
## id name
## Min. : 1.00 Length:66
## 1st Qu.:19.25 Class :character
## Median :35.50 Mode :character
## Mean :35.36
## 3rd Qu.:51.75
## Max. :68.00
Dataset size:
## [1] 29977 3
## rel_type child_part_num parent_part_num
## Length:29977 Length:29977 Length:29977
## Class :character Class :character Class :character
## Mode :character Mode :character Mode :character
Dataset size:
## [1] 52615 4
## part_num name part_cat_id part_material
## Length:52615 Length:52615 Min. : 1.00 Length:52615
## Class :character Class :character 1st Qu.:17.00 Class :character
## Mode :character Mode :character Median :41.00 Mode :character
## Mean :38.91
## 3rd Qu.:60.00
## Max. :68.00
Dataset size:
## [1] 21880 6
## set_num name year theme_id
## Length:21880 Length:21880 Min. :1949 Min. : 1
## Class :character Class :character 1st Qu.:2001 1st Qu.:273
## Mode :character Mode :character Median :2012 Median :497
## Mean :2008 Mean :442
## 3rd Qu.:2018 3rd Qu.:608
## Max. :2024 Max. :752
## num_parts img_url
## Min. : 0.0 Length:21880
## 1st Qu.: 3.0 Class :character
## Median : 31.0 Mode :character
## Mean : 161.4
## 3rd Qu.: 139.0
## Max. :11695.0
Dataset size:
## [1] 323 3
## id name parent_id
## Min. : 3.0 Length:323 Min. : 1.0
## 1st Qu.:205.0 Class :character 1st Qu.:186.0
## Median :469.0 Mode :character Median :411.0
## Mean :419.9 Mean :360.6
## 3rd Qu.:632.5 3rd Qu.:512.5
## Max. :751.0 Max. :697.0
Conclusions:
We can observe a rapid increase in the number of sets released after 2000. The graph shows that the release frequency became more volatile, with significant peaks and troughs.
The most recent years on the graph fluctuate notably, with sharp increases in the number of sets followed by declines. Possible causes include shifts in market strategy and changes driven by the COVID-19 pandemic; the number of sets rose noticeably in 2020 and 2021.
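The yearly counts behind this plot amount to a group-and-count over the sets table. The following is a minimal Python sketch of that step, not the code used for this report. The sample rows and the `demo-*` set numbers are made up for illustration; only the column names `set_num` and `year` come from the summary above.

```python
from collections import Counter

# Hypothetical sample rows for illustration; the real sets table has 21,880 rows.
sets = [
    {"set_num": "demo-1", "year": 1999},
    {"set_num": "demo-2", "year": 2020},
    {"set_num": "demo-3", "year": 2020},
    {"set_num": "demo-4", "year": 2021},
    {"set_num": "demo-5", "year": 2021},
    {"set_num": "demo-6", "year": 2021},
]


def sets_per_year(rows):
    """Count how many sets were released in each year, ordered by year."""
    counts = Counter(row["year"] for row in rows)
    return dict(sorted(counts.items()))


counts = sets_per_year(sets)
for year, n in counts.items():
    print(year, n)
```

Sorting by year keeps the result ready for a line plot, where peaks such as 2020 and 2021 become visible.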
## `summarise()` has grouped output by 'year'. You can override using the
## `.groups` argument.
## Selecting by n
## `summarise()` has grouped output by 'name'. You can override using the
## `.groups` argument.
Conclusions:
The plot shows a clear upward trend in the average number of unique colors used in LEGO sets from around the 1950s to the present. The increase is notably steeper from the early 2000s onward than in previous decades.
This trend may reflect LEGO’s strategy to make sets more appealing and varied, perhaps in response to market demand for more intricate and visually stimulating products.
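The average number of unique colors per set can be derived by linking inventory parts to sets through the inventories table, counting distinct `color_id` values per set, and averaging by year. The sketch below is an assumption about how such a figure could be computed, not the report's own code. The sample rows are invented, and merging all inventory versions of a set into one color pool is a simplification chosen here.

```python
from collections import defaultdict

# Hypothetical sample rows; only the column names come from the summaries above.
sets = [
    {"set_num": "demo-1", "year": 1980},
    {"set_num": "demo-2", "year": 1980},
    {"set_num": "demo-3", "year": 2020},
]
inventories = [
    {"id": 1, "set_num": "demo-1"},
    {"id": 2, "set_num": "demo-2"},
    {"id": 3, "set_num": "demo-3"},
]
inventory_parts = [
    {"inventory_id": 1, "color_id": 0},
    {"inventory_id": 1, "color_id": 0},
    {"inventory_id": 1, "color_id": 15},
    {"inventory_id": 2, "color_id": 4},
    {"inventory_id": 3, "color_id": 0},
    {"inventory_id": 3, "color_id": 4},
    {"inventory_id": 3, "color_id": 15},
    {"inventory_id": 3, "color_id": 71},
]


def avg_unique_colors_by_year(sets, inventories, inventory_parts):
    """Return {year: mean number of distinct colors per set}, ordered by year."""
    inv_to_set = {inv["id"]: inv["set_num"] for inv in inventories}

    # Collect the distinct colors of each set (assumption: versions are merged).
    colors_by_set = defaultdict(set)
    for part in inventory_parts:
        set_num = inv_to_set.get(part["inventory_id"])
        if set_num is not None:
            colors_by_set[set_num].add(part["color_id"])

    # Group the per-set color counts by release year.
    year_of = {s["set_num"]: s["year"] for s in sets}
    per_year = defaultdict(list)
    for set_num, colors in colors_by_set.items():
        if set_num in year_of:
            per_year[year_of[set_num]].append(len(colors))

    return {year: sum(v) / len(v) for year, v in sorted(per_year.items())}


avg_colors = avg_unique_colors_by_year(sets, inventories, inventory_parts)
print(avg_colors)
```

Sets without any inventory parts drop out of the average, which is worth keeping in mind when reading the earliest years.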
## # A tibble: 10 × 4
## id set_num num_minifigs name
## <int> <chr> <int> <chr>
## 1 2579 9293-1 29 Community Workers
## 2 2267 852293-1 28 Fantasy Era Castle Giant Chess Set
## 3 100622 76178-1 25 Daily Bugle
## 4 5411 1063-1 24 Community Workers
## 5 7869 75159-1 23 Death Star
## 6 2154 9349-1 22 Fairytale and Historic Minifigures
## 7 7649 9348-1 22 Community Minifigures
## 8 10402 3425-2 22 Grand Championship Cup
## 9 10538 3425-1 22 Grand Championship Cup - U.S. Men's Team Cup Ed…
## 10 85198 71741-1 22 NINJAGO City Gardens
## Warning: Removed 1 row containing missing values (`geom_line()`).
## Warning: Removed 1 rows containing missing values (`geom_point()`).
Conclusions:
For a significant period, specifically from the early years displayed up to around the late 1970s, the average number of minifigures per set remained relatively constant and close to 1.
Starting from the early 1980s, there is noticeable variability, with the average number of minifigures per set fluctuating more significantly. The fluctuations appear to be somewhat cyclical with peaks and troughs.
The trend becomes more pronounced in later years, with the variability increasing, which could be indicative of more diverse set offerings, special editions, or changes in set design philosophy.
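The minifigure counts used in this section, including the top-10 table above, can be obtained by summing minifigure quantities per inventory and mapping each inventory to its set. The sketch below is illustrative only and not the report's own code. It assumes that `num_minifigs` is the sum of the `quantity` column, which the report does not state, and it uses invented rows and hypothetical `fig-*` and `demo-*` identifiers.

```python
import heapq
from collections import defaultdict

# Hypothetical sample rows; only the column names come from the summaries above.
inventories = [
    {"id": 1, "set_num": "demo-1"},
    {"id": 2, "set_num": "demo-2"},
    {"id": 3, "set_num": "demo-3"},
]
inventory_minifigs = [
    {"inventory_id": 1, "fig_num": "fig-a", "quantity": 2},
    {"inventory_id": 1, "fig_num": "fig-b", "quantity": 1},
    {"inventory_id": 2, "fig_num": "fig-a", "quantity": 1},
    {"inventory_id": 3, "fig_num": "fig-c", "quantity": 5},
]


def minifigs_per_set(inventories, inventory_minifigs):
    """Return {set_num: total minifigures}, summing quantities (an assumption)."""
    inv_to_set = {inv["id"]: inv["set_num"] for inv in inventories}
    totals = defaultdict(int)
    for row in inventory_minifigs:
        set_num = inv_to_set.get(row["inventory_id"])
        if set_num is not None:
            totals[set_num] += row["quantity"]
    return dict(totals)


totals = minifigs_per_set(inventories, inventory_minifigs)

# Pick the sets with the most minifigures, largest first.
top_two = heapq.nlargest(2, totals.items(), key=lambda item: item[1])
print(top_two)
```

Only sets that contain at least one minifigure appear in `totals`, so an average built on it cannot fall below 1.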
Conclusion:
The heatmap suggests a positive correlation between the size of a LEGO set (measured by the number of parts) and its color diversity (measured by the number of unique colors). Sets with more parts also tend to have more distinct colors.
The complexity of a set can be approximated by the number of unique part categories used in it.
## `geom_smooth()` using formula = 'y ~ x'
## [1] 0.5391932
Conclusion:
The scatter plot reveals a positive correlation between the number of parts and set complexity; the coefficient printed above, about 0.54, indicates a moderate relationship. As the number of parts in a set increases, the number of unique part categories tends to increase as well, suggesting that larger sets are generally more complex. The relationship is most pronounced for sets with fewer parts, as indicated by the dense cluster of points near the origin, where complexity rises quickly with the number of parts.
For sets with a very high number of parts (toward the right end of the X-axis), the data points become more spread out, indicating more variability in complexity for these larger sets. It suggests that once a set reaches a certain size, the addition of more parts does not necessarily increase complexity at the same rate. This could be due to the use of repeated parts within these large sets or a design choice to not increase complexity despite a higher part count.
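The complexity measure and its correlation with set size can be sketched as follows. This is a minimal Python illustration, not the report's own code. It assumes the coefficient reported above is a Pearson correlation (the default of R's `cor()`), and all sample rows and `demo-*` / `p*` identifiers are invented.

```python
import math
from collections import defaultdict

# Hypothetical sample rows; only the column names come from the summaries above.
sets = [
    {"set_num": "demo-1", "num_parts": 10},
    {"set_num": "demo-2", "num_parts": 50},
    {"set_num": "demo-3", "num_parts": 400},
]
inventories = [
    {"id": 1, "set_num": "demo-1"},
    {"id": 2, "set_num": "demo-2"},
    {"id": 3, "set_num": "demo-3"},
]
parts = [
    {"part_num": "p1", "part_cat_id": 11},
    {"part_num": "p2", "part_cat_id": 11},
    {"part_num": "p3", "part_cat_id": 14},
    {"part_num": "p4", "part_cat_id": 3},
    {"part_num": "p5", "part_cat_id": 20},
]
inventory_parts = [
    {"inventory_id": 1, "part_num": "p1"},
    {"inventory_id": 1, "part_num": "p2"},
    {"inventory_id": 2, "part_num": "p1"},
    {"inventory_id": 2, "part_num": "p3"},
    {"inventory_id": 2, "part_num": "p4"},
    {"inventory_id": 3, "part_num": "p1"},
    {"inventory_id": 3, "part_num": "p3"},
    {"inventory_id": 3, "part_num": "p4"},
    {"inventory_id": 3, "part_num": "p5"},
]


def complexity_by_set(inventories, inventory_parts, parts):
    """Return {set_num: number of distinct part categories used in the set}."""
    inv_to_set = {inv["id"]: inv["set_num"] for inv in inventories}
    category_of = {p["part_num"]: p["part_cat_id"] for p in parts}
    categories = defaultdict(set)
    for row in inventory_parts:
        set_num = inv_to_set.get(row["inventory_id"])
        category = category_of.get(row["part_num"])
        if set_num is not None and category is not None:
            categories[set_num].add(category)
    return {set_num: len(cats) for set_num, cats in categories.items()}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (spread_x * spread_y)


complexity = complexity_by_set(inventories, inventory_parts, parts)
scored = [s for s in sets if s["set_num"] in complexity]
sizes = [s["num_parts"] for s in scored]
scores = [complexity[s["set_num"]] for s in scored]
r = pearson(sizes, scores)
print(complexity, round(r, 3))
```

Because a large set can reuse the same part categories many times, the category count saturates while the part count keeps growing, which is consistent with the wider spread seen for large sets.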
Over the years, the LEGO Group has navigated various market trends and consumer preferences, as the evolution of its set designs shows. The data points to an overall trend towards greater complexity and diversity in set offerings, visible in the following key developments:
There has been a clear trajectory towards incorporating a broader palette of colors in sets, coupled with an increase in the number of parts.
LEGO has shown adaptability in its themes, often aligning with popular culture and consumer interests, as seen in the consistent popularity of certain minifigure series and licensed themes.
Fluctuations in set releases coincide with global events such as the pandemic, when LEGO appeared to capitalize on increased demand for indoor activities and to adjust its product strategy in response to the economic climate.
The changing number of minifigures and the variability in sets suggest a strategic focus on consumer engagement, offering a mix of both collector-focused and play-intensive sets.
In conclusion, LEGO’s ability to evolve while retaining its core principles has allowed it to remain a leader in the toy industry, consistently delivering products that resonate with consumers of all ages.
I use the Prophet forecasting model to predict the number of sets released per year from the current year up to 2030; the forecast table below extends to 2033.
## Disabling weekly seasonality. Run prophet with weekly.seasonality=TRUE to override this.
## Disabling daily seasonality. Run prophet with daily.seasonality=TRUE to override this.
## ds yhat yhat_lower yhat_upper
## 52 2024-01-01 403.7481 274.8425 516.4237
## 53 2025-01-01 412.1880 292.4113 538.1468
## 54 2026-01-01 419.8156 299.5095 545.3976
## 55 2027-01-01 429.3019 305.5758 556.2788
## 56 2028-01-01 440.6468 316.2936 565.1614
## 57 2029-01-01 449.0867 322.2012 576.1845
## 58 2030-01-01 456.7144 333.9980 575.2458
## 59 2031-01-01 466.2007 328.9045 582.5795
## 60 2032-01-01 477.5456 340.5237 602.0792
## 61 2033-01-01 485.9855 352.2446 607.9193
Conclusion:
The prediction suggests a continued increase in the number of LEGO sets released each year: the point forecast rises from about 404 sets in 2024 to about 457 in 2030.
If the past trend continues without significant change, we might expect the number of LEGO sets released annually to keep rising up to 2030. However, the uncertainty intervals are wide (roughly 334 to 575 sets for 2030), and predictions should be treated with caution because unforeseen future events could change the picture.
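To put the forecast into numbers, the short Python sketch below works directly on the rows printed in the Prophet output above, for 2024 through 2030. It does not rerun the model; it only derives the average yearly increase, the relative growth, and the width of the uncertainty intervals from those printed values. The variable names are chosen here for illustration.

```python
# Forecast rows copied from the Prophet output above:
# (year, yhat, yhat_lower, yhat_upper)
forecast = [
    (2024, 403.7481, 274.8425, 516.4237),
    (2025, 412.1880, 292.4113, 538.1468),
    (2026, 419.8156, 299.5095, 545.3976),
    (2027, 429.3019, 305.5758, 556.2788),
    (2028, 440.6468, 316.2936, 565.1614),
    (2029, 449.0867, 322.2012, 576.1845),
    (2030, 456.7144, 333.9980, 575.2458),
]

first_year, first_yhat, _, _ = forecast[0]
last_year, last_yhat, _, _ = forecast[-1]

# Average increase in predicted sets per year between the first and last row.
avg_increase = (last_yhat - first_yhat) / (last_year - first_year)

# Relative growth of the point forecast over the whole period.
relative_growth = last_yhat / first_yhat - 1

# Width of the uncertainty interval for each forecast year.
widths = [upper - lower for _, _, lower, upper in forecast]

print(f"average increase per year: {avg_increase:.2f} sets")
print(f"growth 2024 to 2030: {relative_growth:.1%}")
print(f"interval width: {min(widths):.0f} to {max(widths):.0f} sets")
```

The interval width is far larger than the predicted yearly increase, which supports reading the upward trend as a tendency rather than a precise figure.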